Power Quality Data Compression Using Principal Component Analysis

Authors

  • Huaying ZHANG
  • Zhengguo ZHU
  • Senjing YAO
  • Bingbing ZHAO
  • Junwei CAO
Abstract

With the increase of non-linear, bursty, or unbalanced loads, power quality issues in the grid are becoming increasingly important. With more power quality monitors installed at higher sampling rates, the growing volume of power quality data makes storage, transmission, and analysis difficult. In this paper, principal component analysis (PCA), a popular feature extraction algorithm in pattern recognition, is applied to the compression of power quality event data. In a power grid, different nodes and phases are normally highly correlated, and PCA projects the original data onto a lower-dimensional space to reduce this redundancy. The compression ratio is determined by the number of principal components retained; with more principal components, the recovery error decreases. We also compare the performance of two derivative algorithms, probabilistic PCA (PPCA) and kernel PCA (KPCA). Experimental results show that PPCA and KPCA achieve smaller recovery errors than PCA at the cost of higher computational complexity.

INTRODUCTION

The ideal signals of a power grid are perfect sine waves with constant frequency, and in three-phase AC the voltage and current of the three phases are expected to be symmetric. However, with the increase of non-linear, bursty, or unbalanced loads, power quality issues in the grid are becoming increasingly important. Power quality problems can threaten the safety and stability of the power grid and result in huge financial losses [1]. Since some power quality disturbances are transient, e.g. voltage sags, power quality monitoring requires a high sampling frequency, which produces a large amount of data. For example, the city of Shenzhen, P. R. China, has had over 600 power quality monitoring nodes deployed for several years [2]. Managing and analysing such big data is becoming a challenging issue: the large volume of power quality monitoring data complicates storage, transmission, querying, and data mining. Advanced analysis of this data is indeed required, but currently comes with very high overhead.

More data does not necessarily mean more information, since correlations introduce redundancy. In this work, the features of power quality monitoring data are investigated further. We believe there are strong correlations among the different channels and nodes of power quality monitoring data, which makes data compression, or feature extraction, possible. For example, power quality events that occur simultaneously in the three phases are highly correlated. A highly efficient data compression method should be able to reduce the overhead of data storage and analysis.

Data compression is not a new research topic, and many approaches reduce data size by improving the coding scheme, such as Huffman coding. In recent years, several methods have been proposed specifically for compressing power quality data using wavelet and wavelet packet transforms [3-6]. In [3] and [4], compression is achieved by thresholding the wavelet transform coefficients and reconstructing the signal from the significant coefficients. Using the same transform-thresholding technique, variations of wavelets, such as slantlets, are used in [5]. In [6], the minimum description length criterion is used for compression with wavelet packets.

In this work, principal component analysis (PCA) [7] is adopted for power quality event data compression. PCA is a statistical procedure that uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables, the so-called principal components.
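To make the idea concrete, here is a minimal sketch (not from the paper) that compresses a block of correlated multi-channel waveform samples with scikit-learn's PCA and reports the resulting compression ratio and relative recovery error. The synthetic test signal, the choice of scikit-learn, and the number of retained components are assumptions made purely for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic stand-in for power quality monitoring data (assumption): four
# monitoring nodes x three phases = 12 strongly correlated voltage channels,
# sampled at 6.4 kHz, with a small amount of measurement noise.
rng = np.random.default_rng(0)
t = np.arange(0, 0.2, 1 / 6400)
phase_angles = [0.0, -2 * np.pi / 3, 2 * np.pi / 3]
channels = []
for node in range(4):
    gain = 1.0 + 0.02 * node                      # slightly different gain per node
    for ang in phase_angles:
        channels.append(gain * np.sin(2 * np.pi * 50 * t + ang))
X = np.column_stack(channels) + 0.01 * rng.standard_normal((t.size, 12))

# Keep m principal components; the compression ratio is set by m.
m = 2
pca = PCA(n_components=m)
Y = pca.fit_transform(X)                          # compressed representation
X_hat = pca.inverse_transform(Y)                  # reconstruction

stored = Y.size + pca.components_.size + pca.mean_.size
print("compression ratio:", round(X.size / stored, 2))
print("relative recovery error:",
      np.linalg.norm(X - X_hat) / np.linalg.norm(X))
```

Because balanced phase channels are nearly linearly dependent, two components reconstruct the block almost exactly; retaining fewer components raises the compression ratio and the recovery error together, which is the trade-off studied in the paper.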
In the second section, we give a brief introduction to PCA and to its derivative algorithms, PPCA (probabilistic principal component analysis) [8] and KPCA (kernel principal component analysis) [9]. PPCA preserves more of the features of the sample data, instead of simply removing the non-principal components, which improves the performance of subsequent recognition and classification. KPCA maps the data into a Hilbert space using a kernel function, which makes it easier to extract principal components. In the third section, we compress and reconstruct real power quality data using these three algorithms; the performance metrics are the compression ratio and the recovery error. The paper is concluded in the fourth section with an outline of future research directions.

PCA, KPCA AND PPCA

PCA

Suppose that we have $N$ samples of an $n$-dimensional vector $x$, arranged in a data matrix $X$ in which each row is a sample and each column is one of the variables $x_1, x_2, \ldots$. We wish to reduce the dimension from $n$ to $m$. Principal component analysis achieves this by finding linear combinations $a_1^T x, a_2^T x, \ldots, a_m^T x$, called principal components, each of which has maximum variance subject to being uncorrelated with the previous principal components. PCA thus reduces the dimensionality of the data considerably while still retaining much of the information in it. Figure 2.1 illustrates how PCA works: the principal component orientation (the signal direction) has maximum variance compared with the non-principal components (the noise).

Figure 2.1. An illustration of PCA.

The specific steps of PCA are as follows (a NumPy sketch of these steps is given below):

1. Normalize the sample data: $\tilde{x}_i(t) = \dfrac{x_i(t) - \mu(x_i)}{\sigma(x_i)}$, where $\mu(x_i)$ is the mean of $x_i$ and $\sigma(x_i)$ is the standard deviation of $x_i$.

2. Compute the covariance matrix $XX^T$ of the normalized sample data.

3. Compute the eigenvalues and eigenvectors of $XX^T$ and sort the eigenvalues. Select the eigenvectors $\alpha_1, \alpha_2, \ldots, \alpha_m$ corresponding to the $m$ largest eigenvalues as the principal component orientations.

4. Compute the projection of the sample data onto the principal components, which is the compressed data: $y(t) = [\alpha_1, \alpha_2, \ldots, \alpha_m]^T X$.

PPCA

Compared with traditional PCA, PPCA extracts information from the non-principal components instead of simply discarding them: it treats the non-principal components as noise that follows a Gaussian distribution, and the parameters of that distribution are obtained by maximum likelihood estimation (MLE); a sketch of this estimation is also given below. The PPCA model is as follows. Each observed sample is an $n$-dimensional vector $t$, and there is an $m$-dimensional ($m < n$) latent vector $x$ that satisfies

$t = Wx + \mu + \epsilon$

where $W$ is an $n \times m$ matrix, $\mu$ is the sample mean, and $\epsilon$ is noise. The hidden variable $x$ follows a Gaussian distribution, $x \sim N(0, I)$; the sample mean is $\mu = \frac{1}{N}\sum_{i=1}^{N} x_i$; and $\epsilon \sim N(0, \Psi)$, where $\Psi$ is the diagonal …
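As a concrete reading of the PCA steps above, the following NumPy sketch implements them directly. It is an illustration under stated assumptions rather than the authors' code: the data matrix is assumed to hold one channel per row (so the covariance matrix is $XX^T$ as written), the toy three-channel signal is invented, and m = 2 is an arbitrary choice.

```python
import numpy as np

def pca_compress(X, m):
    """Compress X (n channels x N samples) to m principal components.

    Follows the steps in the text: per-channel normalization, covariance
    matrix XX^T, eigendecomposition, projection onto the m leading
    eigenvectors. Returns the compressed data y plus what is needed to
    reconstruct: the orientation matrix A, the channel means and stds.
    """
    mu = X.mean(axis=1, keepdims=True)       # mean of each channel
    sigma = X.std(axis=1, keepdims=True)     # standard deviation of each channel
    Xn = (X - mu) / sigma                    # normalized data

    C = Xn @ Xn.T                            # covariance matrix (up to a 1/N factor)
    eigvals, eigvecs = np.linalg.eigh(C)     # eigenvalues in ascending order
    A = eigvecs[:, ::-1][:, :m]              # alpha_1 ... alpha_m, n x m

    y = A.T @ Xn                             # compressed data, m x N
    return y, A, mu, sigma

def pca_reconstruct(y, A, mu, sigma):
    """Invert the projection and undo the normalization."""
    return (A @ y) * sigma + mu

# Toy usage with an invented 3-channel signal (for illustration only).
rng = np.random.default_rng(1)
t = np.linspace(0, 0.1, 640)
X = np.vstack([np.sin(2 * np.pi * 50 * t + p) for p in (0, 2.1, 4.2)])
X += 0.01 * rng.standard_normal(X.shape)

y, A, mu, sigma = pca_compress(X, m=2)
X_hat = pca_reconstruct(y, A, mu, sigma)
print("relative recovery error:", np.linalg.norm(X - X_hat) / np.linalg.norm(X))
```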
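For PPCA, the maximum-likelihood parameters can be obtained in closed form from the same eigendecomposition (the Tipping and Bishop solution). The sketch below assumes the standard isotropic noise model $\epsilon \sim N(0, \sigma^2 I)$ rather than the general diagonal covariance mentioned above, reconstructs each sample from the posterior mean of the latent variable, and again uses an invented toy signal; it is a hedged illustration, not the paper's implementation.

```python
import numpy as np

def ppca_fit(T, m):
    """Maximum-likelihood PPCA (Tipping & Bishop) with isotropic noise.

    T: N x n data matrix (rows are samples). Returns W (n x m), the mean mu
    and the noise variance sigma2 of the model t = W x + mu + eps.
    """
    mu = T.mean(axis=0)
    S = np.cov(T, rowvar=False)                          # n x n sample covariance
    eigvals, eigvecs = np.linalg.eigh(S)
    eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]   # descending order

    sigma2 = eigvals[m:].mean()                          # ML estimate of the noise variance
    W = eigvecs[:, :m] @ np.diag(np.sqrt(np.maximum(eigvals[:m] - sigma2, 0.0)))
    return W, mu, sigma2

def ppca_compress(T, W, mu, sigma2):
    """Posterior mean of the latent variable x given each observation t."""
    m = W.shape[1]
    M = W.T @ W + sigma2 * np.eye(m)
    return np.linalg.solve(M, W.T @ (T - mu).T).T        # N x m

def ppca_reconstruct(X_latent, W, mu):
    return X_latent @ W.T + mu

# Toy usage with correlated channels (for illustration only).
rng = np.random.default_rng(2)
t = np.linspace(0, 0.1, 640)
T = np.column_stack([np.sin(2 * np.pi * 50 * t + p) for p in (0, 2.1, 4.2)])
T += 0.02 * rng.standard_normal(T.shape)

W, mu, sigma2 = ppca_fit(T, m=2)
latent = ppca_compress(T, W, mu, sigma2)
T_hat = ppca_reconstruct(latent, W, mu)
print("relative recovery error:", np.linalg.norm(T - T_hat) / np.linalg.norm(T))
```

A KPCA comparison could be set up along the same lines, for example with scikit-learn's KernelPCA and fit_inverse_transform=True for approximate pre-images, although the kernel and its parameters would be further assumptions beyond what this excerpt specifies.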


Similar articles

Measuring the primordial power spectrum: Principal component analysis of the cosmic microwave background

We implement and investigate a method for measuring departures from scale invariance, both scale-dependent as well as scale-free, in the primordial power spectrum of density perturbations using cosmic microwave background (CMB) Cl data and a principal component analysis (PCA) technique. The primordial power spectrum is decomposed into a dominant scale-invariant Gaussian adiabatic component plus ...


A Critique on Power Spectrum – Area Fractal Method for Geochemical Anomaly Mapping

Power spectrum – area fractal (S-A fractal) method has been frequently applied for geochemical anomaly mapping. Some researchers have performed this method for separation of geochemical anomaly, background and noise and have delineated their distribution maps. In this research, surface geochemical data of Zafarghand Cu-Mo mineralization area have been utilized and some defects of S-A fractal me...


Sparse Structured Principal Component Analysis and Model Learning for Classification and Quality Detection of Rice Grains

In scientific and commercial fields associated with modern agriculture, the categorization of different rice types and determination of its quality is very important. Various image processing algorithms are applied in recent years to detect different agricultural products. The problem of rice classification and quality detection in this paper is presented based on model learning concepts includ...


Modelling of some soil physical quality indicators using hybrid algorithm principal component analysis - artificial neural network

One of the important issues in the analysis of soils is to evaluate their features. In estimation of the hardly available properties, it seems the using of Data mining is appropriate. Therefore, the modelling of some soil quality indicators, using some of the early features of soil which have been proved by some researchers, have been considered. For this purpose, 140 disturbed and 140 undistur...


Principal Component Analysis for Soil Conservation Tillage vs Conventional Tillage in Semi Arid Region of Punjab Province of Pakistan

Principal component analysis is a valid method used for data compression and information extraction in a given set of experiments. It is a well-known classical data analysis technique. There are a number of algorithms for solving the problems, some scaling better than others. Wheat ranks as the staple food of most of the nations as well as an agent of poverty reduction, food security and world ...




Publication date: 2015